
Pfaffian function


Gradient Descent with Provably Tuned Learning-rate Schedules

Sharma, Dravyansh

arXiv.org Artificial Intelligence

Gradient-based iterative optimization methods are the workhorse of modern machine learning. They crucially rely on careful tuning of parameters such as the learning rate and momentum. However, these are typically set using heuristic approaches without formal near-optimality guarantees. Recent work by Gupta and Roughgarden studies how to learn a good step-size in gradient descent. However, like most of the literature with theoretical guarantees for gradient-based optimization, their results rely on strong assumptions on the function class, including convexity and smoothness, which do not hold in typical applications. In this work, we develop novel analytical tools for provably tuning hyperparameters in gradient-based algorithms that apply to non-convex and non-smooth functions. We obtain sample complexity bounds for learning the step-size in gradient descent that match those shown in prior work for smooth, convex functions (up to logarithmic factors), but for a much broader class of functions. Our analysis applies to gradient descent on neural networks with commonly used activation functions (including ReLU, sigmoid and tanh). We extend our framework to tuning multiple hyperparameters, including tuning the learning rate schedule, simultaneously tuning momentum and step-size, and pre-training the initialization vector. Our approach can be used to bound the sample complexity of minimizing both the validation loss and the number of gradient descent iterations.
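
As a purely illustrative sketch of the tuning setup above (not the paper's algorithm or analysis), one can score a grid of candidate step-sizes on problem instances sampled from the target distribution and keep the empirically best one; the helpers sample_problem, grad and val_loss below are hypothetical placeholders.

```python
import numpy as np

def tune_step_size(candidates, sample_problem, num_instances=50, num_iters=100):
    """Return the candidate learning rate with the best average validation loss.

    sample_problem() is a hypothetical helper returning an initial point,
    a gradient oracle and a validation-loss oracle for one sampled instance.
    """
    scores = {eta: 0.0 for eta in candidates}
    for _ in range(num_instances):
        x0, grad, val_loss = sample_problem()
        for eta in candidates:
            x = np.array(x0, dtype=float)
            for _ in range(num_iters):  # plain gradient descent with fixed step-size eta
                x = x - eta * grad(x)
            scores[eta] += val_loss(x) / num_instances
    return min(scores, key=scores.get)
```

The paper's question is how many sampled instances suffice for such an empirically chosen step-size to be near-optimal on the underlying distribution, even when the losses are non-convex and non-smooth.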


Is uniform expressivity too restrictive? Towards efficient expressivity of graph neural networks

Khalife, Sammy, Tonelli-Cueto, Josué

arXiv.org Artificial Intelligence

Uniform expressivity guarantees that a Graph Neural Network (GNN) can express a query without the parameters depending on the size of the input graphs. This property is desirable in applications in order to have a number of trainable parameters that is independent of the size of the input graphs. Uniform expressivity of the two-variable guarded fragment (GC2) of first-order logic is a celebrated result for Rectified Linear Unit (ReLU) GNNs [Barceló et al., 2020]. In this article, we prove that uniform expressivity of GC2 queries is not possible for GNNs with a wide class of Pfaffian activation functions (including the sigmoid and tanh), answering a question formulated by [Grohe, 2021]. We also show that, despite these limitations, many of these GNNs can still efficiently express GC2 queries with a number of parameters that is logarithmic in the maximal degree of the input graphs. Furthermore, we demonstrate that a log-log dependency on the degree is achievable for a certain choice of activation function. This shows that uniform expressivity can be successfully relaxed while still covering the large graphs that appear in practical applications. Our experiments illustrate that our theoretical estimates hold in practice.
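
For reference, the Pfaffian functions referred to above follow Khovanskii's standard definition: a sequence of smooth functions $f_1, \ldots, f_\ell$ on an open domain $U \subseteq \mathbb{R}^n$ forms a Pfaffian chain if each partial derivative is a polynomial in $x$ and the earlier functions of the chain,

$$\frac{\partial f_i}{\partial x_j}(x) = P_{i,j}\bigl(x, f_1(x), \ldots, f_i(x)\bigr), \qquad 1 \le i \le \ell, \; 1 \le j \le n,$$

and a Pfaffian function is a polynomial in $x$ and the functions of such a chain.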


Provable Hyperparameter Tuning for Structured Pfaffian Settings

Balcan, Maria-Florina, Nguyen, Anh Tuan, Sharma, Dravyansh

arXiv.org Machine Learning

Data-driven algorithm design automatically adapts algorithms to specific application domains, achieving better performance. In the context of parameterized algorithms, this approach involves tuning the algorithm parameters using problem instances drawn from the problem distribution of the target application domain. While empirical evidence supports the effectiveness of data-driven algorithm design, providing theoretical guarantees for several parameterized families remains challenging. This is due to the intricate behavior of their corresponding utility functions, which typically exhibit piecewise structure and discontinuities. In this work, we present refined frameworks for providing learning guarantees for parameterized data-driven algorithm design problems in both the distributional and online learning settings. For the distributional learning setting, we introduce the Pfaffian GJ framework, an extension of the classical GJ framework, capable of providing learning guarantees for function classes whose computation involves Pfaffian functions. Unlike the GJ framework, which is limited to function classes whose computation is characterized by rational functions, our proposed framework can deal with function classes involving Pfaffian functions, which are much more general and widely applicable. We then show that for many parameterized algorithms of interest, the utility function possesses a refined piecewise structure, which automatically translates into learning guarantees using our proposed framework. For the online learning setting, we provide a new tool for verifying the dispersion property of a sequence of loss functions. This sufficient condition allows no-regret learning for sequences of piecewise-structured loss functions whose piecewise structure involves Pfaffian transition boundaries.
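
As a rough schematic of the piecewise structure mentioned above (a hedged illustration; the paper's precise definitions may differ), the utility $u(\rho)$ of a parameter $\rho$ on a fixed problem instance has Pfaffian transition boundaries if the parameter space splits into regions cut out by sign conditions on finitely many Pfaffian functions $g_1, \ldots, g_k$, with $u$ agreeing with a fixed piece function on each region:

$$u(\rho) = h_r(\rho) \quad \text{for } \rho \in R_r, \qquad R_r = \{\rho : \operatorname{sign}(g_i(\rho)) = s_{r,i}, \; i = 1, \ldots, k\}.$$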


VC dimension of Graph Neural Networks with Pfaffian activation functions

D'Inverno, Giuseppe Alessio, Bianchini, Monica, Scarselli, Franco

arXiv.org Artificial Intelligence

Graph Neural Networks (GNNs) have emerged in recent years as a powerful tool to learn tasks across a wide range of graph domains in a data-driven fashion. Based on a message-passing mechanism, GNNs have gained increasing popularity due to their intuitive formulation, which is closely linked to the Weisfeiler-Lehman (WL) test for graph isomorphism, to which they have been proven equivalent. From a theoretical point of view, GNNs have been shown to be universal approximators, and their generalization capability (namely, bounds on the Vapnik-Chervonenkis (VC) dimension) has recently been investigated for GNNs with piecewise-polynomial activation functions. The aim of our work is to extend this analysis of the VC dimension of GNNs to other commonly used activation functions, such as the sigmoid and hyperbolic tangent, using the framework of Pfaffian function theory. Bounds are provided with respect to the architecture parameters (depth, number of neurons, input size) as well as with respect to the number of colors resulting from the 1-WL test applied to the graph domain. The theoretical analysis is supported by a preliminary experimental study.
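
A concrete example of why the sigmoid and hyperbolic tangent fall within Pfaffian function theory: each satisfies a polynomial first-order differential equation in itself, so each forms a Pfaffian chain of length one and degree two,

$$\sigma'(x) = \sigma(x)\bigl(1 - \sigma(x)\bigr), \qquad \tanh'(x) = 1 - \tanh^2(x).$$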


A topological description of loss surfaces based on Betti Numbers

Bucarelli, Maria Sofia, D'Inverno, Giuseppe Alessio, Bianchini, Monica, Scarselli, Franco, Silvestri, Fabrizio

arXiv.org Machine Learning

In the context of deep learning models, attention has recently been paid to studying the surface of the loss function in order to better understand training with methods based on gradient descent. This search for an appropriate description, both analytical and topological, has led to numerous efforts to identify spurious minima and characterize gradient dynamics. Our work aims to contribute to this field by providing a topological measure for evaluating loss complexity in the case of multilayer neural networks. We compare deep and shallow architectures with common sigmoidal activation functions by deriving upper and lower bounds on the complexity of their loss function, revealing how that complexity is influenced by the number of hidden units, training models, and the activation function used. Additionally, we find that certain variations in the loss function or model architecture, such as adding an $\ell_2$ regularization term or implementing skip connections in a feedforward network, do not affect the loss topology in specific cases.
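
One natural reading of the topological measure mentioned above (offered only as an illustration; the paper's exact definition may differ) is via the Betti numbers $b_i$ of the sublevel sets of the loss $L$, where $b_0$ counts connected components, $b_1$ counts one-dimensional holes, and so on; the complexity at level $c$ is then summarized by their sum,

$$B(c) = \sum_{i \ge 0} b_i\bigl(\{\theta : L(\theta) \le c\}\bigr).$$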


On the Depth of Deep Neural Networks: A Theoretical View

Sun, Shizhao (Nankai University) | Chen, Wei (Microsoft Research) | Wang, Liwei (Peking University) | Liu, Xiaoguang (Nankai University) | Liu, Tie-Yan (Microsoft Research)

AAAI Conferences

People believe that depth plays an important role in the success of deep neural networks (DNNs). However, as far as we know, this belief lacks solid theoretical justification. We investigate the role of depth from the perspective of the margin bound, in which the expected error is upper bounded by the empirical margin error plus a capacity term based on the Rademacher Average (RA). First, we derive an upper bound for the RA of DNNs and show that it increases with depth; this indicates a negative impact of depth on test performance. Second, we show that deeper networks tend to have larger representation power (measured by a Betti-numbers-based complexity) than shallower networks in the multi-class setting, and thus can achieve smaller empirical margin error; this implies a positive impact of depth. The combination of these two results shows that, for DNNs with a restricted number of hidden units, increasing depth is not always beneficial, since there is a trade-off between its positive and negative impacts. These results inspire us to seek alternative ways to realize the positive impact of depth, e.g., imposing margin-based penalty terms on the cross-entropy loss so as to reduce the empirical margin error without increasing depth. Our experiments show that in this way we achieve significantly better test performance.
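
For context, a generic (binary-classification) margin bound of the kind invoked above reads as follows; the paper's multi-class version and constants may differ. With probability at least $1-\delta$ over a sample of size $m$, for every $f$ in the class $\mathcal{F}$ and margin $\gamma > 0$,

$$\Pr\bigl[\mathrm{error}(f)\bigr] \le \widehat{R}_{\gamma}(f) + \frac{2}{\gamma}\,\mathfrak{R}_m(\mathcal{F}) + \sqrt{\frac{\ln(1/\delta)}{2m}},$$

where $\widehat{R}_{\gamma}(f)$ is the empirical margin error and $\mathfrak{R}_m(\mathcal{F})$ is the Rademacher average of $\mathcal{F}$.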